Abstract: This paper focuses on the coordination of a population
of thermostatically controlled loads (TCLs) with unknown
parameters to achieve group objectives. The problem involves
designing the device bidding and market clearing strategies
to motivate self-interested users to realize efficient energy
allocation subject to a peak energy constraint. This coordination
problem is formulated as a mechanism design problem, and we
propose a mechanism to implement the social choice function
in dominant strategy equilibrium. The proposed mechanism
consists of a novel bidding and clearing strategy that incorporates
the internal dynamics of TCLs in the market mechanism
design, and we show it can realize the team optimal solution. A
learning scheme is proposed to address the unknown load model
parameters. Numerical simulations are performed to validate
the effectiveness of the proposed coordination framework.

Index Terms: Mechanism design, demand response, market-based
coordination, thermostatically controlled loads

I. Introduction 

Demand response has attracted considerable research attention
in recent years, and is regarded as one of the most
important means to improve the efficiency and reliability
of the future smart grid. A natural way to achieve demand
response is through various pricing schemes, such as Real
Time Pricing (RTP), Time of Use (TOU) and Critical Peak
Pricing (CPP) [1], [2]. Many validation projects [?] have
been carried out to demonstrate the performance of these
pricing schemes in terms of payment reduction, load shifting,
and peak shaving. These price-based methods either directly
pass the wholesale energy price to end-users [2] or design
pricing strategies in heuristic ways [?]. It is thus hard to
achieve predictable and reliable aggregated response, which
is essential in various demand response applications, such as
energy capping, load following, and frequency regulation.

To achieve accurate and reliable load response, aggregated
load control has been extensively studied in the literature.
A simple form of aggregated load control is direct load
control (DLC), where the aggregator can remotely control the
operations of residential appliances based on the agreement
between customers and the utility company. While traditional
DLC is mainly concerned with peak load management [?],
[?], recent research effort focuses more on the modeling and
control of different kinds of aggregated loads, such as data
center servers [?], [?], hybrid electrical vehicles [?], [?],
and thermostatically controlled loads [?], to participate
in various demand response programs. Some of these DLC
methods require fast communications between the aggregator
and individual loads. The communication overhead can be
reduced using advanced state estimation algorithms [?], [?]
that can accurately estimate load state information without
frequently collecting measurements from the loads.

Another important paradigm of aggregated load control is
market-based coordination. It borrows ideas from economics
[?] to coordinate a group of self-interested users to
achieve desired aggregated load response [18], [19]. Different
from DLC, market-based coordination affects the load
response indirectly via an internal price signal. The internal
price can be dramatically different from the wholesale price
due to specific group objectives. For instance, in [20] and
[21], a market-based approach is proposed to efficiently
allocate thermal resources among offices based only on local
information. In [22] and [23], a multi-agent based control
framework is proposed to integrate distributed energy
resources for various coordination objectives. A distributed
algorithm is developed in [24] and [25] for the utility company
and users to jointly determine optimal prices and demand
schedules via an iterative bidding and clearing process. In
[26], a group of smart buildings are coordinated through an
internal price signal to provide frequency regulation services
to the ancillary market. In addition, the Pacific Northwest
National Laboratory launched the GridWise® demonstration
project to validate the market-based coordination strategies
for residential loads [27]. The demonstration project
involved 112 residential houses in Washington and Oregon, and
showed that the market-based coordination strategies could
reduce the utility demand and congestion at key times.

Although the aggregated dynamics of TCLs may significantly
affect the performance of the control strategies, many
existing market-based coordination strategies either neglect
these internal dynamics or use a simplified model to characterize
them. In this paper, we consider the coordination of a group
of TCLs to maximize the social welfare subject to a peak
energy constraint, where the internal dynamics of TCLs are
taken into account. This coordination problem poses several
challenges. First, the user utilities are private information,
making it rather challenging for the coordinator to achieve
group objectives with incomplete information. Second, many
existing works [25], [28] require multiple iterations between
the agents and the coordinator to achieve the optimal social
outcome. The real-time implementation of such coordination
algorithms requires considerable communication resources.
Third, much of the existing literature assumes accurate load
models with known parameters. However, the GridWise®
demonstration project [27] suggests this is not always the
case. In practice, the information each user sends to the
coordinator can only depend on local measurements, such as
room temperature and “on/off” state. Therefore, an estimation
scheme is needed for the users to compute their bids based
only on online measurements.

The key contribution of this paper lies in the development
of a market-based coordination framework for residential air
conditioning loads with a systematic consideration of all the
aforementioned challenges. In this paper, we formulate the
coordination problem as a mechanism design problem [?],
[29]. The price-responsive loads are modeled as individual
utility maximizers, while the group objective is encoded in
the social choice function, which is to maximize the social
welfare subject to a peak energy constraint. We propose a
mechanism and show it can implement the social choice
function in dominant strategy equilibrium. Such a solution
concept does not require iterative information exchanges
between the coordinator and the individual loads, and can
be implemented with limited communication resources. The
proposed mechanism contains a novel bidding and clearing
strategy that incorporates the internal dynamics of the TCLs
into the market mechanism design, and we show that it can
realize the team optimal solution.

Different from many existing works, the problem is addressed
with a systematic consideration of various practical
factors, such as heterogeneous load dynamics, private
information of individual users, unknown parameters of the
load model, and communication resources for the information
exchange. All these factors are brought up based on the
observations in the GridWise® demonstration project [27].
They are important not only for customer privacy protection
and end-user engagement, but also for the cost-effective
implementation of the real-time control strategies. Once our
framework is properly implemented, it can accurately achieve
the desired load responses, and improve the operational
efficiency of the distribution system in an economically
feasible way.

The rest of the paper proceeds as follows. A motivating
example based on a real-world demonstration project is
presented in Section II, followed by a problem formulation
in Section III. A mechanism is constructed in Section IV
to implement the optimal energy allocation. A joint
state-parameter estimation framework is presented in Section V,
followed by simulation results in Section VI and some
concluding remarks in Section VII.

II. Motivating Example 

The framework proposed in this paper is largely motivated
by the Pacific Northwest GridWise® demonstration project
[27], where a 5-minute double-auction market is created to
coordinate a group of TCLs to cap the aggregated peak
energy. Each device is equipped with a smart thermostat
that can measure the room temperature and communicate
with the coordinator. Before each market period, the device
measures its room temperature, $T_c$, and submits a bid to the
coordinator. The bid consists of the load power and the
bidding price. Since the rated power of the load is different
from its actual power due to environmental disturbances, in
practice each device is required to bid the measured average
power of the most recent market period during which the load
is on. The bidding price is determined by a bidding curve
shown in Fig. 1, where $P_{avg}$ is the average clearing price
over a certain price history (e.g., 24 hours), $\sigma$ is the standard
deviation of the clearing prices during the given history,
and $T_{min}$, $T_{desired}$ and $T_{max}$ are the user-specified minimum,
desired, and maximum temperatures, respectively. We denote
the bidding power and price as $Q_{bid}$ and $P_{bid}$, respectively.
In addition, each user can specify energy use preferences
through a smart thermostat interface (see Fig. 2). This user
preference affects the slope of the bidding curve.

The coordinator collects all the bids and orders the bidding
prices in a decreasing sequence, $P^1_{bid}, \ldots, P^N_{bid}$. With the associated
power sequence, $Q^1_{bid}, \ldots, Q^N_{bid}$, a demand curve can be
constructed to map the clearing price to aggregated power.
Fig. 3 illustrates how the demand curve is constructed. This
curve is then used to determine the market clearing price
that respects the feeder capacity constraint: when the total
demand is less than the feeder capacity, the market clearing
price is equal to the base price, $P_{base}$ (Fig. 4), which is the
wholesale energy price plus a retail modifier as defined by
the tariff of American Electric Power (AEP) [30]; otherwise
the market price, $P_c$, is determined by the intersection of the
demand curve and the feeder capacity constraint (Fig. 5).
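The clearing rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the demonstration project's implementation: the function name `clear_double_auction`, the bid format `(price, power)`, and the convention that the marginal bid sets the price are all assumptions made for the example.

```python
def clear_double_auction(bids, capacity, base_price):
    """Return a clearing price for a list of (bid_price, bid_power) pairs.

    Assumed convention (hypothetical): if all bids fit under the feeder
    capacity, the base price clears the market; otherwise the price is
    that of the first bid at which the cumulative power, taken in order
    of decreasing bid price, exceeds the capacity.
    """
    # Order bids by decreasing price to build the stepwise demand curve.
    ordered = sorted(bids, key=lambda bid: bid[0], reverse=True)

    total_power = sum(power for _, power in ordered)
    if total_power <= capacity:
        return base_price

    cumulative = 0.0
    for price, power in ordered:
        cumulative += power
        if cumulative > capacity:
            # The demand curve crosses the capacity line at this step.
            return price
    return base_price  # not reached when total_power > capacity
```

With bids `[(50, 3), (40, 4), (30, 5)]` and a capacity of 6, the cumulative power exceeds the capacity at the second bid, so the sketch clears at a price of 40.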

After the market is cleared, each device receives the
energy price and adjusts its setpoint, $T_{set}$, according to a
response curve as shown in Fig. 6. This setpoint modifies
the system dynamics and affects the temperature trace of the
TCL, and therefore affects the bid of each user for the next
market period. Notice that all the bidding and user response
processes are executed by a programmable controller, and
the user only needs to specify his/her preferences via the
thermostat interface. To initialize the market process, the user
needs to specify $T_{min}$, $T_{max}$, $T_{desired}$ and $K$, the device
needs to measure the temperature and the power of the last
“on” cycle, and the coordinator needs to collect all the bids,
estimate the power of the unresponsive loads, $Q_{uc}$, and the
feeder capacity constraint, $D$.

Apart from the GridWise® project, a similar demonstration
project is also implemented in AEP, Ohio [31], which
involves more households and a more sophisticated market
bidding design. These projects provide insights for the
coordination of residential loads from a practical point of view.
However, the bidding and pricing strategies are designed in a
heuristic way, which may result in constraint violations and
market inefficiencies. To address these challenges, there is
a strong need to develop a general coordination framework
that can serve as a theoretical foundation to improve the
performance of the control scheme and help to design other
similar market-based coordination strategies.



Figure 1. The controller measures its current
temperature $T_c$ and submits a bid $P_{bid}$ to the
coordinator using this curve.

Figure 2. User interface used in the GridWise®
demonstration project [27].

Figure 3. The demand curve based on the user
bids, where $P^i_{bid}$ is the bidding price sequence in
decreasing order, and $Q^i_{bid}$ is the power of the
most recent on cycle.

Figure 4. The demand curve constructed based
on all the bids. If the total demand is less than the
feeder capacity constraint, then the clearing price
is equal to the base price.

Figure 5. When the total demand is greater than
the feeder power constraint, the clearing price
is determined by the intersection of the demand curve
and the feeder capacity constraint.

Figure 6. The user response to the price. For any
given price, the devices determine the temperature
setpoint according to this curve.


III. Problem Formulation 

Consider a coordination problem for a group of TCLs,
where the coordinator allocates energy to users to maximize
the social welfare subject to a total energy constraint. Each
device is assumed to be equipped with a smart thermostat that
has two main functions. First, it allows the user to specify
energy use preferences via an interface such as the sliding bar
shown in Fig. 2 to indicate one’s trade-off between comfort
and cost. Second, before each market period it submits a bid
to the coordinator based on the user’s preference and local device
measurements, such as power consumption, “on/off” states,
and local temperature. The coordinator collects the user bids,
determines the energy price, and broadcasts the price to all
the devices. Each device will then adjust the temperature
setpoint in response to the energy price to maximize the
individual utility. This will modify the system dynamics and
therefore affect the user bids for the next period. In the
considered scenario, we assume that each user is a price taker,
namely, an individual user’s decision will not significantly
affect the market price. This is a standard assumption when
the market involves a large number of players [?, Chap.
12.F], [32], [33].

The rest of this section provides formal mathematical
descriptions of the main components of the proposed
framework.

A. User Preferences and Utility 

Assume that there are $N$ self-interested users. Each user
needs to determine the temperature setpoint to obtain an
energy allocation that maximizes his individual utility (the
user’s comfort minus the electricity cost). In other words,
each user is confronted with the trade-off between comfort
and electricity cost: when the electricity price is high, the
device will adjust the temperature setpoint to save electricity
cost at the sacrifice of user comfort. Formally, a function
$V_i : \mathbb{R} \to \mathbb{R}$ can be used to represent the comfort level for
each user with energy allocation $a_i$. Assume that $V_i(a_i)$
is concave, continuously differentiable, $V_i(0) = 0$ and
$V_i'(0) > 0$. Let $\theta_i(t_k)$ represent the private information
of user $i$. Denote $E_i^m$ as the energy consumption for the
$i$th load if it is “on” during the entire period, which gives
$a_i \le E_i^m$. The individual utility maximization problem can
be formulated as follows:

$$\max_{a_i} \; V_i(a_i; \theta_i(t_k)) - P_c a_i \qquad (1)$$

$$\text{subject to: } 0 \le a_i \le E_i^m,$$

where $P_c$ is the energy price. Let $h_i : \mathbb{R} \to \mathbb{R}$ be the optimal
solution to the optimization problem (1); we have:

$$h_i(P_c; \theta_i(t_k)) = \arg\max_{0 \le a_i \le E_i^m} V_i(a_i; \theta_i(t_k)) - P_c a_i. \qquad (2)$$

We assume that $h_i$ is continuous and non-increasing with
respect to $P_c$ for each $i = 1, \ldots, N$. Notice that the user
cannot directly choose his optimal energy allocation. Instead, he
can only determine the temperature setpoint, which affects
the energy consumption through the load dynamics.
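As an illustration of the individual problem, the following Python sketch computes the optimal allocation for one hypothetical comfort function. The quadratic form $V(a) = c\,a - \tfrac{1}{2} d\,a^2$ with $c, d > 0$ is an assumption chosen only because it is concave, continuously differentiable, and satisfies $V(0) = 0$ and $V'(0) > 0$; the paper does not prescribe this form, and the names `optimal_allocation`, `c`, `d` and `e_max` are illustrative.

```python
def optimal_allocation(price, c, d, e_max):
    """Maximizer of c*a - 0.5*d*a**2 - price*a over 0 <= a <= e_max.

    The objective is concave in a, so the constrained maximizer is the
    unconstrained stationary point (c - price) / d clipped to the
    feasible interval [0, e_max].
    """
    if d <= 0 or c <= 0:
        raise ValueError("c and d must be positive for this comfort model")
    stationary = (c - price) / d
    return min(max(stationary, 0.0), e_max)
```

For this comfort model the resulting demand is continuous and non-increasing in the price, which matches the assumption placed on the optimal allocation function: with `c=4`, `d=2` and `e_max=1.5`, the allocation is 1.5 at price 1, 1.0 at price 2, and 0 at price 5.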

B. Individual Load Dynamics 

Let $\eta_i(t) \in \mathbb{R}^n$ be the continuous state of the $i$th load.
Denote $q_i(t)$ as the “on/off” state: $q_i(t) = 0$ when the TCL
is off, and $q_i(t) = 1$ when it is on. For both “on” and “off”
states, the thermal dynamics of a TCL system can be typically
modeled as a linear system:

$$\dot{\eta}_i(t) = \begin{cases} A_i \eta_i(t) + B^i_{on} & \text{if } q_i(t) = 1 \\ A_i \eta_i(t) + B^i_{off} & \text{if } q_i(t) = 0. \end{cases} \qquad (3)$$

Many existing works use a first-order linear system to capture
the TCL dynamics [11], [15], [16], where $\eta_i(t)$ only consists
of the room temperature. Although the first-order model
is adequate for small TCLs such as refrigerators, it is not
appropriate for residential air conditioning systems, which
require a 2-dimensional linear system model incorporating
both air and mass temperature dynamics [?]. Such a second-order
model is typically referred to as the Equivalent Thermal
Parameter (ETP) model [34]. In this paper we focus on
the second-order ETP model, which includes the first-order
model as a special case. Let $\phi_i = [A_i, B^i_{on}, B^i_{off}]$ be the
model parameters. Typical values of these parameters and the
factors that affect these parameters can be found in [?].

The power state of the TCL is typically regulated by a
hysteretic controller based on the control deadband $[u_i(t) -
\delta/2, u_i(t) + \delta/2]$, where $u_i(t)$ is the temperature setpoint of
the $i$th TCL and $\delta$ is the deadband. Let $T_c^i(t)$ denote the room
temperature of the $i$th load. In the cooling mode, the load
is turned off when $T_c^i(t) < u_i(t) - \delta/2$, it is turned on
when $T_c^i(t) > u_i(t) + \delta/2$, and it remains in the same power state
otherwise. This hysteretic control policy can be described as:

$$q_i(t^+) = \begin{cases} 1 & \text{if } T_c^i(t) > u_i(t) + \delta/2 \\ 0 & \text{if } T_c^i(t) < u_i(t) - \delta/2 \\ q_i(t) & \text{otherwise.} \end{cases} \qquad (4)$$


For notational convenience, we define a hybrid state $z_i(t) =
[\eta_i(t), q_i(t)]^T$, which consists of both the temperature and
the “on/off” state of the load. Let $[t_k, t_k + T]$ be the $k$th
market period; then the energy consumption of each load
during the $k$th period depends on the system state and the
setpoint control $u_i(t)$. In this case, the private information
consists of the system state and the model parameters. Therefore,
the energy consumption of each load can be represented as
$e_i(u_i(t_k), z_i(t_k), \phi_i)$. This energy consumption function can
be derived by calculating the portion of time that the system
is on over the entire market period (details of this calculation
are presented in Section IV). An example is shown in Fig.
7, where a second-order ETP model is used and the initial
room temperature is 72.8°F. Let $\theta_i(t_k) = (z_i(t_k), \phi_i)$ be
the overall private information of load $i$; then the energy
function can be written as $e_i(u_i(t_k), \theta_i(t_k))$. Notice that the
private information for users is time varying, as it contains
the system state.
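To make the energy consumption function concrete, the sketch below simulates one market period for a single load and returns the energy used. It is an illustration under simplifying assumptions, not the calculation of Section IV: it uses the first-order special case (a scalar room temperature) rather than the second-order ETP model, a forward-Euler step, and hypothetical parameter names and values (`a`, `b_on`, `b_off`, `power_kw`).

```python
def energy_over_period(setpoint, temp0, on0, a, b_on, b_off, power_kw,
                       deadband=1.0, period_s=300.0, dt=1.0):
    """Energy (kWh) used by a cooling load over one market period.

    Scalar dynamics, assumed for illustration:
        dT/dt = a*T + b_on   when the load is on
        dT/dt = a*T + b_off  when the load is off
    The on/off state follows the hysteretic rule around the setpoint.
    """
    temp, is_on = temp0, on0
    on_time = 0.0
    for _ in range(int(round(period_s / dt))):
        # Hysteretic control in cooling mode.
        if temp > setpoint + deadband / 2:
            is_on = True
        elif temp < setpoint - deadband / 2:
            is_on = False
        # Forward-Euler update of the room temperature.
        drift = b_on if is_on else b_off
        temp += dt * (a * temp + drift)
        if is_on:
            on_time += dt
    # Energy is rated power times the portion of the period spent on.
    return power_kw * on_time / 3600.0
```

Sweeping `setpoint` for a fixed initial temperature traces out an energy-versus-setpoint curve of the kind shown in Fig. 7: a low setpoint keeps the load on for the whole period, a high setpoint keeps it off, and intermediate setpoints give partial on-times.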

After the market is cleared, each user wants to determine
the control action $u_i(t_k)$ such that the resulting energy
consumption equals the optimal solution to (1). Since the
optimal control depends on the energy price, we can define
a user response function, $\Lambda_i : \mathbb{R} \to \mathbb{R}$, with $u_i(t_k) = \Lambda_i(P_c)$.
Therefore, the optimal energy allocation function $h_i$ as
defined in (2) should satisfy the following:

$$h_i(\cdot; \theta_i(t_k)) = e_i(\Lambda_i(\cdot), \theta_i(t_k)). \qquad (5)$$



Figure 7. Energy consumption of the TCL during a market clearing cycle 
as a function of the temperature setpoint. 


The left-hand side of equation (5) represents the optimal
energy allocation for a given price, while the right-hand side
arises from the physical property of the individual loads,
and indicates that the user can specify the control action
$u_i$ to match the actual energy consumption to the optimal
allocation. An example of function $h_i$ is shown in Fig. 8,
where the response curve is piecewise linear (as shown in Fig.
6) and the initial room temperature is 72.8°F. To derive the
function $h_i(\cdot; \theta_i(t_k))$, we first determine the control setpoint
based on the market price using the response curve (Fig. 6),
then calculate the corresponding energy consumption based
on the energy function $e_i(\cdot, \theta_i(t_k))$. Since the energy function
$e_i(\cdot, \theta_i(t_k))$ depends on the system dynamics (3) and the
control policy (4), the load dynamics are incorporated in
function $h_i$ through this process.

C. Problem Statement 

The coordinator obtains energy from the wholesale market
at a cost denoted as $C(\sum_{i=1}^{N} a_i)$. We assume that $C(\cdot)$
is differentiable and convex. The energy is then allocated
to users via a price signal to maximize the social welfare,
which can be defined as $\sum_{i=1}^{N} V_i(a_i; \theta_i(t_k)) - C(\sum_{i=1}^{N} a_i)$.
Therefore, the coordinator’s optimization problem can be
formulated as follows:

$$\max_{P_c} \; \sum_{i=1}^{N} V_i(a_i; \theta_i(t_k)) - C\Big(\sum_{i=1}^{N} a_i\Big) \qquad (6)$$

$$\text{subject to: } \begin{cases} \sum_{i=1}^{N} a_i \le D \\ 0 \le a_i \le E_i^m, \; \forall i = 1, \ldots, N \\ a_i = h_i(P_c; \theta_i(t_k)), \; \forall i = 1, \ldots, N, \end{cases}$$

where $D$ is the maximum energy for the aggregated loads.
Without loss of generality, we assume that $D < N E_i^m$.
Note that the feeder capacity constraints considered in the
GridWise® demonstration project can be represented by the
total energy constraint. This is because the feeder capacity
constraint is mainly due to the consideration of the thermal
characteristics of the feeder. The instantaneous power can
exceed the feeder power limit without causing damage to the
grid, as long as the energy over a certain period is effectively
capped to protect the feeder from overheating.

Figure 8. Energy consumption of the TCL during a market clearing cycle
as a function of the energy price.

The optimization problem (6) defines a Stackelberg game
[?], where the coordinator first makes control decisions to
maximize the social welfare, then the individual users choose
energy consumption to maximize individual utility based
on the coordinator’s control decisions. In such Stackelberg
games, the upper bound on the social welfare can be typically
characterized by the team optimal solution [?], which is the
optimal solution to the following team problem:

$$\max_{a_1, \ldots, a_N} \; \sum_{i=1}^{N} V_i(a_i; \theta_i(t_k)) - C\Big(\sum_{i=1}^{N} a_i\Big) \qquad (7)$$

$$\text{subject to: } \begin{cases} \sum_{i=1}^{N} a_i \le D \\ 0 \le a_i \le E_i^m, \; \forall i = 1, \ldots, N. \end{cases}$$

In the above team problem, the coordinator and the users
cooperatively maximize the social welfare subject to the peak
energy constraint. In general, the team solution results in
a social welfare at least as high as that of the solution to (6), since the
coordinator’s optimization problem (6) is more restrictive:
one only needs to find an energy allocation to maximize
the social welfare to solve the team problem, while in the
coordinator’s optimization problem, we also need to find a
price to satisfy the additional constraint $a_i = h_i(P_c; \theta_i(t_k))$ in (6). However, such
a clearing price may not always exist for an arbitrarily given
team optimal solution.

Example 1: As an example, consider two users with
$V_1(a_1; \theta_1(t_k)) = a_1$, $V_2(a_2; \theta_2(t_k)) = 3a_2$. The energy cost
for the coordinator is $C(a_1 + a_2) = 2a_1 + 2a_2$. The team
problem is to maximize the social welfare subject to an
energy constraint, i.e.:

$$\max_{a_1, a_2} \; \sum_{i=1}^{2} V_i(a_i; \theta_i(t_k)) - C(a_1 + a_2) \qquad (8)$$

$$\text{subject to: } \begin{cases} a_1 + a_2 \le 1 \\ 0 \le a_i \le 2, \; \text{for } i = 1, 2. \end{cases}$$

The team optimal solution is $a_1 = 0$, $a_2 = 1$. However,
according to (2), given any energy price, $a_i$ is either 0 or 2.
Therefore, the coordinator cannot find a price to realize the
team optimal solution.
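Example 1 can be checked numerically. The following Python sketch is illustrative only: it finds the team optimum by a coarse grid search and enumerates the allocations that prices on a grid can induce. The function names and the grid resolution are assumptions, and prices at which a user is exactly indifferent (where the maximizer in (2) is not unique) are skipped.

```python
def team_welfare(a1, a2):
    """Welfare of Example 1: (a1 + 3*a2) minus the cost (2*a1 + 2*a2)."""
    return (a1 + 3 * a2) - (2 * a1 + 2 * a2)


def team_optimum(steps=10, e_max=2.0, cap=1.0):
    """Grid search over 0 <= a_i <= e_max with a1 + a2 <= cap."""
    n, cap_n = int(round(e_max * steps)), int(round(cap * steps))
    best = None
    for i in range(n + 1):
        for j in range(n + 1):
            if i + j > cap_n:
                continue  # violates the energy constraint
            welfare = team_welfare(i / steps, j / steps)
            if best is None or welfare > best[0]:
                best = (welfare, i / steps, j / steps)
    return best[1], best[2]


def response(marginal_value, price, e_max=2.0):
    """Maximizer of (marginal_value - price)*a on [0, e_max]; None on a tie."""
    if marginal_value > price:
        return e_max
    if marginal_value < price:
        return 0.0
    return None


def realized_allocations(prices):
    """Allocations (a1, a2) induced by each price, ignoring tie prices."""
    found = set()
    for price in prices:
        a1, a2 = response(1.0, price), response(3.0, price)
        if a1 is not None and a2 is not None:
            found.add((a1, a2))
    return found
```

On a price grid from 0 to 5, the only induced allocations are (2, 2), (0, 2) and (0, 0), none of which equals the team optimum (0, 1); the first two also violate the energy constraint.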

To address this concern, we introduce the concept of 
realizable energy allocation: 

Definition 1: The energy allocation vector,
$a = (a_1, \ldots, a_N)$, can be realized by $P_c$, if
$a_i = h_i(P_c; \theta_i(t_k))$ for all $i = 1, \ldots, N$.

It is clear that not all the energy allocations can be realized.
In this paper, we have assumed that $V_i$ is concave and
continuously differentiable, and $h_i$ is continuous and
non-increasing. We will show in Section V that under these
conditions, there is always a price to realize the team optimal
solution. In other words, the upper bound given by the team
optimal solution is tight. Therefore, the problem of the paper
can be formulated as follows:

Problem 1: Design the bidding and clearing strategy, such 
that the cleared price realizes the team optimal solution a*. 

The coordinator’s optimization problem (6) cannot be
directly addressed using standard optimization techniques,
since the individual valuations are unknown to the
coordinator. For this reason, to achieve the group objectives, the
coordinator needs to design a bidding strategy to collect
information from the individual users, and then determine
the price based on the user bids.

Remark 1: The market design for many traditional assets
is well understood. For instance, in the energy market, generators
can be simply characterized by an output range depending on
their ramp rates during each market period. However, the internal
dynamics of TCLs are more complex and depend more on
the environment, and thus cannot be handled in the same
way. Therefore, an important contribution of this paper is to
incorporate the dynamics of TCLs in the energy market
design. In addition, although this paper only considers the load
dynamics within one market period, it is an important step
towards establishing a fully dynamic version of the problem
where multiple market periods are taken into account.

IV. A Mechanism Design Framework 

In this section, we adopt the mechanism design approach 
to solve Problem 1. First the problem is formulated as a 
mechanism design problem, then a mechanism is constructed 
to implement the desired social outcome. In addition, a 
realistic bidding strategy with a simplified message space is 
proposed to reduce the communication overhead. 

A. The Mechanism Design Problem 

Mechanism design studies how to aggregate the individual 
preferences into a social choice while the individual’s actual 
preferences are not publicly observable. In a mechanism 
design problem, each user is assumed to selfishly take actions 
to maximize the individual utility, while the coordinator 
makes the collective choice that achieves various group 
objectives. Since the individual utility is unknown to the 
coordinator, he can require each user to submit a bid to collect 
information. In this case, the key problem for the coordinator 
is to align individual objectives with system-level objectives. 
In other words, a proper bidding and pricing strategy needs 
to be designed, such that when each user selfishly maximizes 
the individual utility, the resulting outcome also achieves the 
desired group objectives (for example, maximizes the social 
welfare). The rest of this subsection introduces basic concepts 
in mechanism design. 

Let $x \in X$ be the outcome of the mechanism that
consists of the energy allocation and the energy price, i.e.,
$x = (a_1, \ldots, a_N, P_c)$. The utility of each user (comfort
minus electricity cost) depends on the outcome. Moreover,
we assume that at time $t_k$, each user can privately observe
his utility, $U_i$, over different outcomes. We
model this by supposing that user $i$ privately observes a
parameter $\theta_i$ that determines his utility. Notice that we drop
the dependence of $\theta_i$ on $t_k$ throughout the rest of the paper
for notational convenience. In mechanism design, $\theta_i \in \Theta_i$ is
usually referred to as user $i$’s type [?, p. 858], where
$\Theta_i$ denotes the set of all the possible types. In our problem,
the user type contains the system state, $z_i(t_k)$, and the model
parameter, $\phi_i$. In particular:

$$U_i(x; \theta_i) = V_i(a_i; \theta_i) - P_c a_i, \qquad (9)$$

where $\theta_i = [z_i(t_k), \phi_i]$.

As the user preferences are private, to determine the
optimal energy price, the coordinator also needs to require
each user to submit a bid to reveal some information.
Formally, this can be formulated as a message space $M =
M_1 \times \cdots \times M_N$, where $M_i$ denotes the space of messages
(bids) the $i$th user can communicate to the coordinator.
The structure of $M_i$ depends on particular applications. For
example, in the demonstration project, each device submits
a price and a quantity, so we have $(P^i_{bid}, Q^i_{bid}) \in M_i$. In
[24], each device submits the slope of the demand curve, $\beta_i$,
in which case $\beta_i \in M_i$. After collecting all the user bids, the
market is cleared with an energy price and a corresponding
energy allocation. The clearing strategy can be represented
by an outcome function, $g : M \to X$, that maps the user
bids to an outcome, $x$. The message space and the outcome
function together fully characterize the rules governing the
procedure for making the collective choice. This is typically
referred to as a mechanism [?], which can be denoted as
$\Gamma = (M_1, \ldots, M_N, g(\cdot))$.

Each user observes $\theta_i$ privately and determines what to bid
to maximize his utility. This process can be represented by a
bidding strategy $m_i : \Theta_i \to M_i$ that maps the user type to a
message. There are many solution concepts for a mechanism,
such as Nash equilibrium, Bayesian Nash equilibrium, etc.
Of particular interest to our framework in this paper is the
dominant strategy equilibrium. Denote $m_{-i}$ as the collection
of strategies of all the users other than $i$; then the dominant
strategy equilibrium is defined as follows:

Definition 2 (Dominant Strategy Equilibrium [?]):
The strategy profile $(m_1^*(\cdot), \ldots, m_N^*(\cdot))$ is a
dominant strategy equilibrium of mechanism
$\Gamma = (M_1, \ldots, M_N, g(\cdot))$ if for all $i$ and all $\theta_i \in \Theta_i$,
$U_i(g(m_i^*(\theta_i), m_{-i}), \theta_i) \ge U_i(g(m_i'(\theta_i), m_{-i}), \theta_i)$ for all
$m_i'(\theta_i) \in M_i$ and all $m_{-i} \in M_{-i}$.

The equilibrium strategy characterizes the individual’s
self-interested behavior: each user is an individual welfare
maximizer. However, from the coordinator’s point of view, a more
interesting question is to find the best choice for the overall
social welfare. For this reason, a social choice function
$f : \Theta \to X$ can be defined to represent the desired social
outcome of the coordinator. More specifically, $f(\cdot)$ determines
what outcome will be chosen by the coordinator when he
knows all the private information. In our problem, $f$ consists
of the optimal price to the optimization problem (6) and the
resulting energy allocation. If we define $\theta = (\theta_1, \ldots, \theta_N)$,
the conflict between the personal interest and social interest
can be captured by the concept of implementation:

Definition 3 (Implementation [?]): A mechanism $\Gamma =
(M_1, \ldots, M_N, g(\cdot))$ implements the social choice
function $f(\cdot)$ in dominant strategies if there exists a
dominant strategy equilibrium $m^*(\cdot)$ of $\Gamma$, such that
$g(m_1^*(\theta_1), \ldots, m_N^*(\theta_N)) = f(\theta)$ for all $\theta \in \Theta$.

In the above definition, $g(m_1^*(\theta_1), \ldots, m_N^*(\theta_N))$
represents the resulting outcome of individual maximization, while
$f(\theta)$ denotes the desired social outcome. The concept of
implementation characterizes the social choice that can be
realized when all the users take actions to selfishly
maximize the individual utility. To this end, Problem 1 can be
equivalently stated as follows:

Problem 2: Design a mechanism to implement the social
choice function $f(\cdot)$ that maximizes the social welfare
subject to a peak energy constraint, i.e., $f(\theta) =
(h_1(P_c^*; \theta_1(t_k)), \ldots, h_N(P_c^*; \theta_N(t_k)), P_c^*)$, where $P_c^*$ is the
solution to the optimization problem (6). Furthermore, $P_c^*$
realizes the team optimal solution.

In the above mechanism design problem, the coordinator
needs to design the message space and the market clearing
rule such that the optimal social welfare can be implemented
when each user selfishly maximizes the individual utility.
Meanwhile, the peak energy constraint needs to be
respected.

B. Constructing the Mechanism 

Let $f(\theta) = (a_1^*, \ldots, a_N^*, P_c^*)$ be the social choice function
that maximizes the social welfare subject to the peak energy
constraint. Specifically, $P_c^*$ is the optimal solution to (6), and
$f(\theta)$ satisfies the following condition:

$$a_i^* = h_i(P_c^*; \theta_i), \quad \forall i = 1, \ldots, N. \qquad (10)$$

This subsection constructs a mechanism to implement $f(\cdot)$.
Consider a mechanism $\Gamma^*$, where each device is asked
to submit the function $h_i(\cdot; \theta_i)$. Since we have assumed that
$h_i(P_c; \theta_i)$ is continuous and non-increasing with respect to
$P_c$, the message space is the function space of all
non-increasing and continuous functions. Notice that the user’s
actual bids may deviate from function $h_i$, unless they are
motivated to bid $h_i$. Let $b_i(\cdot; \theta_i)$ be a non-increasing and
continuous function that represents the user’s actual bid. The
aggregated demand curve $b(\cdot; \theta)$ can be obtained by adding
individual bidding functions, i.e., $b(\cdot; \theta) = \sum_{i=1}^{N} b_i(\cdot; \theta_i)$. In
this mechanism, each user is required to submit a function,
which requires considerable communication resources. This
bidding strategy will be simplified in the next subsection to
reduce the communication overhead.

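Here we propose the following outcome function $g(b_1, \ldots, b_N) = (a_1^*, \ldots, a_N^*, P_c^*)$ to clear the market:

$$a_i^* = b_i(P_c^*; \theta_i) \quad \text{for all } i = 1, \ldots, N, \tag{11}$$

$$P_c^* = \max\{\bar{P}, P^*\}, \tag{12}$$

$$P^* = C'(b(P^*; \theta)), \tag{13}$$

$$b(\bar{P}; \theta) = D, \tag{14}$$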


where $C'$ represents the derivative of the cost function $C(\cdot)$. According to (13) and (14), $P^*$ is the marginal production cost of procuring $b(P^*; \theta)$ amount of energy, while $\bar{P}$ is the energy price at which the aggregated demand is equal to the maximum allowed amount $D$. Since $h_i$ is continuous and non-increasing, and we have assumed that $D < NE$, $\bar{P}$ exists. Intuitively, the social welfare is maximized when the market price equals the marginal production cost, i.e., $P_c^* = P^*$. However, in equation (14), the function $b$ is non-increasing with respect to price, indicating that any feasible price that respects the feeder capacity constraint should be no less than $\bar{P}$. Therefore, in the proposed outcome function, the clearing price equals $P^*$ whenever $P^* > \bar{P}$, and equals $\bar{P}$ otherwise. Once the energy price is determined, the allocation exactly follows the user bids, i.e., $a_i^* = b_i(P_c^*; \theta_i)$. For illustration, we construct the following example to show how to derive the optimal solution from the proposed clearing strategy.

Example 2: Consider 100 users with $V_i = -\frac{1}{2} a_i^2 + (i - P_c) a_i$. Assume that after proper scaling, the maximum energy consumption for each user is 1. The individual utility maximization problem can be formulated as follows:

$$\max_{a_i} \; -\frac{1}{2} a_i^2 + (i - P_c) a_i \tag{15}$$

$$\text{subject to: } 0 \le a_i \le 1.$$

The optimal solution to this problem is:

$$a_i^* = \begin{cases} 0 & \text{if } P_c \ge i \\ 1 & \text{if } P_c \le i - 1 \\ i - P_c & \text{otherwise.} \end{cases} \tag{16}$$

In addition, let us assume that the real time price is 20, and the maximum 5-minute energy due to the feeder capacity constraint is 50, i.e., $P^* = 20$ and $D = 50$. According to (16), when $P_c = 99$, only the 100th user consumes 1 unit of energy, and the aggregated energy is 1. When $P_c = 98$, the 99th and the 100th users each consume 1 unit of energy, and the corresponding aggregated energy is 2, and so forth. Therefore, the price that corresponds to the energy limit is 50, i.e., $\bar{P} = 50$. Since $\bar{P} > P^*$, we conclude that $P_c^* = \bar{P}$.
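
The clearing rule of Example 2 can be checked numerically. The following is a minimal Python sketch under the assumptions of the example (100 users, unit maximum consumption, a constant marginal cost of 20 and an energy cap of 50). The function names and the bisection tolerance are illustrative choices and are not part of the proposed mechanism.

```python
def demand(i: int, price: float) -> float:
    """Optimal consumption of user i at a given price, following (16)."""
    return min(1.0, max(0.0, i - price))


def aggregate_demand(price: float, n_users: int = 100) -> float:
    """Sum of the individual optimal consumptions at the given price."""
    return sum(demand(i, price) for i in range(1, n_users + 1))


def capacity_price(cap: float, n_users: int = 100, tol: float = 1e-9) -> float:
    """Bisection for the price at which the aggregated demand equals the cap.

    Relies on the aggregated demand being continuous and non-increasing in price.
    """
    lo, hi = 0.0, float(n_users)  # demand is n_users at lo and 0 at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if aggregate_demand(mid, n_users) > cap:
            lo = mid  # still above the cap: the price must go up
        else:
            hi = mid
    return 0.5 * (lo + hi)


def clearing_price(marginal_price: float, cap: float, n_users: int = 100) -> float:
    """Clearing price as the larger of the marginal price and the capacity price."""
    return max(marginal_price, capacity_price(cap, n_users))


p_bar = capacity_price(50.0)
p_clear = clearing_price(20.0, 50.0)
print(round(p_bar, 6), round(p_clear, 6))
```

Under these assumptions the bisection returns a capacity price of about 50, which exceeds the marginal price of 20, so the sketch clears the market at the capacity price, in agreement with the example.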

The rest of this subsection discusses some properties of 
the proposed mechanism. 

Proposition 1: When each user is a price taker, the strategy profile $(h_1(\cdot; \theta_1), \ldots, h_N(\cdot; \theta_N))$ is a dominant strategy equilibrium of the proposed mechanism $\Gamma^*$.

This result follows easily from the price taker assumption. Its proof can be found in the Appendix. In the proposed mechanism, the optimal bid of each user does not depend on the bidding decisions of others. This is a very important property, since in our particular problem, each user does not know the other users' preferences or actions. Therefore, if the bidding decision of one user had to depend on the actions of another, the equilibrium strategy could not be achieved unless all the users had accurate predictions of the other users' actions, which may not be a reasonable assumption. In addition, Proposition 1 only holds when there are many users such that the influence of an individual user on the market price is negligible. In other cases (such as an oligopolistic market), the mechanism needs to be designed differently.

Now we can establish the following key property of the 
proposed mechanism: 

Proposition 2: The proposed mechanism $\Gamma^*$ implements the social choice function $f(\cdot)$. Furthermore, the resulting market clearing price realizes the team optimal solution.

The proof of this proposition can be found in the Appendix.


C. Realistic Bidding Strategy 

The proposed mechanism provides a general solution to the coordination problem formulated in this paper. In real-world applications, directly submitting the function $h_i$ requires considerable communication resources, and might impinge on customer privacy. Therefore, in this subsection we explore the structure of the function $e_i(\cdot; \theta_i)$ to simplify the message space and reduce the communication overhead.

In this paper we assume that the TCL consumes a constant power when it is “on”, and consumes no energy when it is “off”. For this reason, the energy consumption function $e_i(\cdot; \theta_i(t_k))$ can be derived by calculating the portion of time that the system is on during the entire market period. For example, assume that the system is “on” at the end of the $(k-1)$th period. When the initial temperature $\eta_i(t_k)$ is given, the state trajectory of the linear dynamic model can be derived as $\eta_i(t) = e^{A_i t}\eta_i(t_k) + A_i^{-1}(e^{A_i t} - I)B_{on}^i$, where $\eta_i(t_k) = [\eta_i^{(1)}(t_k), \eta_i^{(2)}(t_k)]^T$, $\eta_i^{(1)}(t_k) = T_c^i(t_k)$ and $I$ is the identity matrix. When the trajectory hits the boundary of the control deadband, the power state will switch and the system is off. Therefore, the trajectory of the system state $\eta_i(t)$ and the power state $q_i(t)$ for the entire period can be derived, and the portion of time that the system is “on” can be calculated based on $q_i(t)$. In particular, consider a system in cooling mode. If the load is “on” at the end of the $(k-1)$th period, i.e., $q_i(t_k^-) = 1$, we have the following (the case for $q_i(t_k^-) = 0$ can be derived similarly):


$$e_i(u_i(t_k); \theta_i(t_k)) = \begin{cases} E_i^m & \text{if } u_i(t_k) < \tilde{T}^i(t_k) + \delta/2 \\ 0 & \text{if } u_i(t_k) > T_c^i(t_k) + \delta/2 \\ E_i^m \times a & \text{otherwise,} \end{cases}$$

where $a = \frac{1}{T}\int_{t_k}^{t_k+T} q_i(t)\,dt = \frac{T'}{T}$ is the portion of time that the system is on, and $T'$ satisfies the following:

$$\begin{cases} \eta_i(t_k + T') = e^{A_i T'}\eta_i(t_k) + A_i^{-1}(e^{A_i T'} - I)B_{on}^i \\ \eta_i^{(1)}(t_k + T') = u_i(t_k) - \delta/2. \end{cases}$$

$\tilde{T}^i(t_k)$ is the room temperature at $t_k + T$ given that the system is on during the entire period between $t_k$ and $t_k + T$, which satisfies the following:

$$\begin{cases} \eta_i(t_k + T) = e^{A_i T}\eta_i(t_k) + A_i^{-1}(e^{A_i T} - I)B_{on}^i \\ \eta_i^{(1)}(t_k + T) = \tilde{T}^i(t_k) \\ \eta_i^{(1)}(t_k) = T_c^i(t_k). \end{cases} \tag{17}$$

$\tilde{T}^i$ is defined in (17) to characterize the condition in which the load is “on” for the entire period and therefore consumes the maximum energy. Intuitively, if the room temperature at $t_k$ is less than the lower bound of the control deadband ($T_c^i(t_k) < u_i(t_k) - \delta/2$), the power state will be “off” until the room temperature hits the boundary of the deadband. On the other hand, if $u_i(t_k) < \tilde{T}^i(t_k) + \delta/2$, it indicates that the load is always “on”, and the room temperature does not hit the boundary for the entire period.

Figure 9. The energy response curve $h_i$ and its approximation.

To simplify the message space, we approximate $h_i$ with a step function as illustrated in Fig. 9, where $c_1$ and $c_2$ are computed based on the control setpoint and user type. For notational convenience, define $c_1 = e_i(u_1; \theta_i)$ and $c_2 = e_i(u_2; \theta_i)$, where $u_1$ and $u_2$ are the temperature control setpoints corresponding to $c_1$ and $c_2$, respectively. For example, using the second-order ETP model and the control policy, $u_1$ and $u_2$ for the $i$th device can be obtained as:

$$\begin{cases} u_1 = T_c^i(t_k) + \delta/2 \\ u_2 = L A_i^{-1} e^{A_i T}\left(A_i \eta_i(t_k) + B_{on}^i\right) - L A_i^{-1} B_{on}^i + \delta/2 = \tilde{T}^i(t_k) + \delta/2, \end{cases} \tag{18}$$

where $L = [1, 0]$, and the power state of the $i$th TCL is “on” at $t_k^-$.

The step function in Fig. 9 can be fully characterized by two scalars: $P_{bid}^i$ and $Q_{bid}^i$, where $P_{bid}^i$ is the middle point of $c_1$ and $c_2$, while $Q_{bid}^i$ is the power consumption when the device is on during the market period. In this case, the message space of each user $M_i$ is reduced from a function space to a space of $\mathbb{R}^2$, and each bid is of the form $[P_{bid}^i, Q_{bid}^i]$.

Remark 2: Bidding and pricing can be viewed as information exchange between the coordinator and the loads that is essential for optimal decision making. Many advanced DLC methods also have communication requirements [36] and can also accomplish certain group objectives. Some DLC strategies may even learn the user responses through the input/output user behaviors. The main difference of the proposed market-based approach lies in its emphasis on the quantitative incorporation of user preferences, the economic interpretation of user bids and coordination signals, and the encoding of internal load dynamics and user preference information into the bids.
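
A coordinator that receives such two-scalar bids can build the aggregated step demand curve and clear it against the feeder cap. The sketch below is a minimal illustration. It assumes that each bid is a pair (bid price, power when on), that a device runs only when its bid price is at least the clearing price, and that during congestion the clearing price is picked among the submitted bid prices. The function names, the sample bids and the fallback used when ties prevent capping are hypothetical.

```python
from typing import List, Tuple

Bid = Tuple[float, float]  # (bid price, power drawn when the device is on)


def demand_at(price: float, bids: List[Bid]) -> float:
    """Aggregated step demand: a device runs only if its bid price >= price."""
    return sum(q for p, q in bids if p >= price)


def clear_market(bids: List[Bid], base_price: float, cap: float) -> Tuple[float, float]:
    """Return (clearing price, cleared power) for a given base price and cap."""
    uncongested = demand_at(base_price, bids)
    if uncongested <= cap:
        # No congestion: pass the base price through.
        return base_price, uncongested
    # Congestion: walk up the bid prices until the cleared power fits the cap.
    for price in sorted({p for p, _ in bids if p > base_price}):
        cleared = demand_at(price, bids)
        if cleared <= cap:
            return price, cleared
    # Hypothetical fallback: no bid price caps the demand, so price everyone out.
    return float("inf"), 0.0


sample_bids = [(30.0, 4.0), (25.0, 3.0), (22.0, 5.0), (18.0, 2.0), (10.0, 6.0)]
print(clear_market(sample_bids, base_price=15.0, cap=20.0))  # uncongested case
print(clear_market(sample_bids, base_price=15.0, cap=8.0))   # congested case
```

With the sample bids, the uncongested call returns the base price, while the congested call raises the price to the lowest bid level at which the cleared power fits under the cap.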


V. Output Based Bidding 


The proposed bidding strategy in Section IV assumes 
the knowledge of ETP model parameters. In practice these 
parameters are difficult to derive, and the ETP model used in 
the framework may be inaccurate in terms of characterizing 
the real energy consumption of the TCLs. To address these 
challenges, we present a joint state and parameter estimation 
framework, which enables users to compute bidding prices 
only based on local measurements. 

In the ETP model, $A_i$ is a constant, while $B_{on}^i$ and $B_{off}^i$ are time varying, which depend on the outside temperature and the solar heat gain. Let $\zeta_k \in \mathbb{R}^2$ be a vector denoting the outside temperature and the solar heat gain. We assume that $\zeta_k$ can be measured or estimated. When we have some rough statistical information about the model parameters, the system dynamics can be captured with an uncertain discrete dynamic model with Gaussian noise:

$$\begin{cases} \eta_i(t_k) = A_i \eta_i(t_{k-1}) + B_i \zeta_{k-1} + C_i + w_i \\ y_i(t_k) = L \eta_i(t_k) + v_i, \end{cases} \tag{19}$$

where $y_i(t_k)$ is the output measurement (air temperature), $L = [1, 0]$ and we have two linear subsystems depending on the power state of the load:

$$C_i = \begin{cases} C_{on}^i & \text{if } q_i(t_{k-1}) = 1 \\ C_{off}^i & \text{if } q_i(t_{k-1}) = 0. \end{cases} \tag{20}$$

In this model, the dependence of the load dynamics on the external signal $\zeta_k$ is made explicit. Therefore, $A_i$, $B_i$, $C_{on}^i$ and $C_{off}^i$ are time invariant unknown parameters. Here we assume that all the noise terms follow Gaussian distributions:



$$\begin{cases} w_i \sim \mathcal{N}(w_i \mid 0, \Omega_i) \\ v_i \sim \mathcal{N}(v_i \mid 0, \Sigma_i) \\ \rho_i \sim \mathcal{N}(\rho_i \mid 0, \Phi_0). \end{cases} \tag{21}$$

Let $\eta_i(t_1) = m_0^i + \rho_i$ be the initial state ($\rho_i$ is the noise). Denote $\sigma_i = [A_i, B_i, C_{on}^i, C_{off}^i, \Omega_i, \Sigma_i, m_0^i, \Phi_0]$ as the unknown parameter to be estimated. The problem can then be formulated as estimating $\sigma_i$ based on local measurements, $Y_i = (y_i(t_1), \ldots, y_i(t_M))$. This can be cast as a joint state-parameter estimation problem, which can be solved using the expectation maximization (EM) algorithm [37, Chap. 13]. The EM algorithm is a two-stage iterative optimization technique for finding the maximum likelihood solution for the unknown parameters. In other words, it finds the optimal $\sigma_i$ that maximizes the likelihood function $p(Y_i \mid \sigma_i)$, where $Y_i = (y_i(t_1), \ldots, y_i(t_M))$. The EM algorithm starts from some initial selection for the model parameters, $\sigma_{old}$. In the first stage (E-step), we evaluate the posterior distribution of the state $p(Z_i \mid Y_i, \sigma_{old})$ assuming that all the parameters are known, where $Z_i = (\eta_i(t_1), \ldots, \eta_i(t_M))$. In the second stage (M-step), the derived posterior distribution is used to find the update of $\sigma_i$ that maximizes the expectation of the logarithm of the complete-data likelihood function, which is:

$$Q(\sigma_i, \sigma_{old}) = \sum_{Z_i} p(Z_i \mid Y_i, \sigma_{old}) \ln p(Y_i, Z_i \mid \sigma_i). \tag{22}$$

After the update for the parameter estimation is derived in the M-step, we assign it to $\sigma_{old}$ and go back to the E-step. This procedure is iterated until the estimation of the state and parameters converges. The detailed steps of the proposed estimation algorithm for the ETP model can be briefly summarized as follows:

1) The E Step: The E step finds the distribution for the system state $\eta_i(t_k)$ conditioned on the full observation sequence, $Y_i = (y_i(t_1), \ldots, y_i(t_M))$, assuming that the model parameters are known as $\sigma_{old}$. This inference problem can be solved efficiently using the sum-product algorithm [37] in two steps: first, the distribution of the state $\eta_i(t_k)$ conditioned on a partial observation sequence $(y_i(t_1), \ldots, y_i(t_k))$ can be derived with a Kalman filter; second, the conditional distribution $p(\eta_i(t_k) \mid Y_i)$ can be found with a Kalman smoother.

Denote $\alpha(\eta_i(t_k))$ as the conditional distribution $p(\eta_i(t_k) \mid y_i(t_1), \ldots, y_i(t_k))$, which satisfies:

$$\alpha(\eta_i(t_k)) = \mathcal{N}(\eta_i(t_k) \mid \mu_k, \Phi_k), \tag{23}$$

where $\mathcal{N}$ stands for the Gaussian distribution with $\mu_k$ and $\Phi_k$ as its mean and covariance, respectively. In the context of linear-Gaussian systems, the sum-product algorithm [37] gives the following recursion equations:

$$\begin{cases} \mu_k = A_i \mu_{k-1} + B_i \zeta_{k-1} + C_i + K_k\left(y_i(t_k) - L A_i \mu_{k-1} - L B_i \zeta_{k-1} - L C_i\right) \\ \Phi_k = (I - K_k L) P_{k-1}, \end{cases} \tag{24}$$

where $C_i = C_{on}^i$ if the $i$th load is “on”, and $C_i = C_{off}^i$ otherwise. $P_{k-1}$ and the Kalman gain matrix $K_k$ are defined as:

$$\begin{cases} P_{k-1} = A_i \Phi_{k-1} A_i^T + \Omega_i \\ K_k = P_{k-1} L^T (L P_{k-1} L^T + \Sigma_i)^{-1}. \end{cases} \tag{25}$$

The initial conditions for the recursion equations are given by:

$$\begin{cases} \mu_1 = m_0^i + K_1\left(y_i(t_1) - L m_0^i\right) \\ \Phi_1 = (I - K_1 L)\Phi_0, \end{cases} \tag{26}$$

where $K_1 = \Phi_0 L^T (L \Phi_0 L^T + \Sigma_i)^{-1}$.

With the above recursion equations, we can derive the distribution for $\eta_i(t_k)$ conditioned on the observations from $y_i(t_1)$ to $y_i(t_k)$. Next we turn to the problem of finding the probability distribution for $\eta_i(t_k)$ given all observations from $y_i(t_1)$ to $y_i(t_M)$. Denote the conditional distribution $p(\eta_i(t_k) \mid Y_i)$ as $\gamma(\eta_i(t_k))$, which satisfies:

$$\gamma(\eta_i(t_k)) = \mathcal{N}(\eta_i(t_k) \mid \hat{\mu}_k, \hat{\Phi}_k). \tag{27}$$

The sum-product algorithm gives the following recursion equations:

$$\begin{cases} \hat{\mu}_k = \mu_k + J_k\left(\hat{\mu}_{k+1} - A_i \mu_k - B_i \zeta_k - C_i\right) \\ \hat{\Phi}_k = \Phi_k + J_k\left(\hat{\Phi}_{k+1} - P_k\right) J_k^T, \end{cases} \tag{28}$$

where $J_k = \Phi_k A_i^T (P_k)^{-1}$.

With the recursion equations presented above, the conditional distribution $p(\eta_i(t_k) \mid Y_i)$ can be computed using backward induction.
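
The forward and backward recursions can be illustrated with a deliberately simplified scalar model. The sketch below assumes a one-dimensional state (the ETP state in this paper is two-dimensional), an output matrix equal to 1, and a known input sequence `d` that lumps the terms $B_i \zeta_{k-1} + C_i$. All names and numerical values are illustrative assumptions and do not reproduce the simulation settings of the paper.

```python
def kalman_filter(y, d, a, q, r, m0, p0):
    """Forward pass, scalar analogue of (24)-(26).

    d[k] is the known input driving the transition into step k (d[0] is unused).
    Returns filtered means, filtered variances and predicted variances.
    """
    gain = p0 / (p0 + r)                      # K_1
    mu = [m0 + gain * (y[0] - m0)]            # initial condition (26)
    phi = [(1.0 - gain) * p0]
    pred = []
    for k in range(1, len(y)):
        p = a * phi[-1] * a + q               # predicted variance P_{k-1}, as in (25)
        gain = p / (p + r)                    # Kalman gain K_k, as in (25)
        prior = a * mu[-1] + d[k]             # predicted mean
        mu.append(prior + gain * (y[k] - prior))  # mean update, as in (24)
        phi.append((1.0 - gain) * p)          # variance update, as in (24)
        pred.append(p)
    return mu, phi, pred


def kalman_smoother(mu, phi, pred, d, a):
    """Backward pass, scalar analogue of (28)."""
    mu_s, phi_s = list(mu), list(phi)
    for k in range(len(mu) - 2, -1, -1):
        j = phi[k] * a / pred[k]              # J_k
        mu_s[k] = mu[k] + j * (mu_s[k + 1] - a * mu[k] - d[k + 1])
        phi_s[k] = phi[k] + j * (phi_s[k + 1] - pred[k]) * j
    return mu_s, phi_s


# Noise-free synthetic data generated by the same scalar model (illustrative values).
a, q, r = 0.9, 0.01, 0.04
d = [0.0] + [0.5] * 19
y = [1.0]
for k in range(1, 20):
    y.append(a * y[-1] + d[k])

mu, phi, pred = kalman_filter(y, d, a, q, r, m0=1.0, p0=0.1)
mu_s, phi_s = kalman_smoother(mu, phi, pred, d, a)
```

Because the synthetic data are noise free and the model is exact, the smoothed means follow the measurements, and each smoothed variance is no larger than the corresponding filtered variance.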

2) The M Step: The M step tries to find the parameter update that maximizes the logarithm of the complete-data likelihood function (22). Equation (22) indicates that aside from the conditional distribution $p(\eta_i(t_k) \mid Y_i, \sigma_{old})$ (already obtained in the E step), the likelihood function also depends on the joint distribution $p(Z_i, Y_i \mid \sigma_i)$. For the linear-Gaussian system (19), the logarithm of this joint distribution $p(Z_i, Y_i \mid \sigma_i)$ is given by:

$$\ln p(Z_i, Y_i \mid \sigma_i) = \sum_{k=2}^{M} \ln p(\eta_i(t_k) \mid \eta_i(t_{k-1}), A_i, B_i, C_i, \Omega_i) + \sum_{k=1}^{M} \ln p(y_i(t_k) \mid \eta_i(t_k), L, \Sigma_i) + \ln p(\eta_i(t_1) \mid m_0^i, \Phi_0), \tag{29}$$

where the dependence of the joint distribution on the unknown model parameters is made explicit. The complete-data likelihood function $Q(\sigma_i, \sigma_{old})$ can then be obtained by taking the expectation of (29) over $Z_i$ using the posterior distribution $p(\eta_i(t_k) \mid Y_i, \sigma_{old})$ derived in the E step.

Let $\sigma_i' = (A_i', B_i', C_{on}^{i\prime}, C_{off}^{i\prime}, \Omega_i', \Sigma_i', m_0', \Phi_0')$ be the update of the unknown parameter in the M step, i.e., $\sigma_i' = \arg\max_{\sigma_i} Q(\sigma_i, \sigma_{old})$. The explicit formula for each component of $\sigma_i'$ is given as follows (please refer to [37] for the detailed derivation):


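(i) Maximizing (22) over $m_0^i$ and $\Phi_0$, the updates can be derived as:

$$\begin{cases} m_0' = \mathbb{E}[\eta_i(t_1)] \\ \Phi_0' = \mathbb{E}[\eta_i(t_1)\eta_i(t_1)^T] - \mathbb{E}[\eta_i(t_1)]\,\mathbb{E}[\eta_i(t_1)^T]. \end{cases}$$

(ii) Maximizing the likelihood function (22) over $A_i$, the update of $A_i$ is given by:

$$A_i' = \left(\sum_{k=2}^{M}\left(\mathbb{E}[\eta_i(t_k)\eta_i(t_{k-1})^T] - B_i \zeta_{k-1}\mathbb{E}[\eta_i(t_{k-1})^T] - C_i \mathbb{E}[\eta_i(t_{k-1})^T]\right)\right) \times \left(\sum_{k=2}^{M}\mathbb{E}[\eta_i(t_{k-1})\eta_i(t_{k-1})^T]\right)^{-1}. \tag{30}$$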

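(iii) Maximizing the likelihood function (22) over $B_i$, we can derive the update of $B_i$ as follows:

$$B_i' = \left(\sum_{k=2}^{M}\left(\mathbb{E}[\eta_i(t_k)] - A_i'\mathbb{E}[\eta_i(t_{k-1})] - C_i\right)\zeta_{k-1}^T\right) \times \left(\sum_{k=2}^{M}\zeta_{k-1}\zeta_{k-1}^T\right)^{-1}. \tag{31}$$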

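(iv) Maximizing the likelihood function (22) over $C_{on}^i$ and $C_{off}^i$, the updates are given as:

$$\begin{cases} C_{on}^{i\prime} = \dfrac{1}{\vartheta_1}\displaystyle\sum_{k \in M_1}\left(\mathbb{E}[\eta_i(t_k)] - A_i'\mathbb{E}[\eta_i(t_{k-1})] - B_i'\zeta_{k-1}\right) \\ C_{off}^{i\prime} = \dfrac{1}{\vartheta_2}\displaystyle\sum_{k \in M_2}\left(\mathbb{E}[\eta_i(t_k)] - A_i'\mathbb{E}[\eta_i(t_{k-1})] - B_i'\zeta_{k-1}\right), \end{cases} \tag{32}$$

where $M_1 \subset \{1, 2, \ldots, M\}$ denotes the time instants when the system is on, and $\vartheta_1$ represents the size of $M_1$. $M_2$ and $\vartheta_2$ are defined similarly.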

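(v) The update for $\Omega_i$ can also be derived by maximizing the likelihood function with respect to $\Omega_i$, which gives:

$$\begin{aligned} \Omega_i' = \frac{1}{M-1}\sum_{k=2}^{M}\Big\{ & A_i'\mathbb{E}[\eta_i(t_{k-1})\eta_i(t_{k-1})^T]A_i'^T - \mathbb{E}[\eta_i(t_k)\eta_i(t_{k-1})^T]A_i'^T - A_i'\mathbb{E}[\eta_i(t_{k-1})\eta_i(t_k)^T] \\ & + \mathbb{E}[\eta_i(t_k)\eta_i(t_k)^T] - (B_i'\zeta_{k-1} + C_i')\mathbb{E}[\eta_i(t_k)^T] - \mathbb{E}[\eta_i(t_k)](\zeta_{k-1}^T B_i'^T + C_i'^T) \\ & + C_i'\mathbb{E}[\eta_i(t_{k-1})^T]A_i'^T + B_i'\zeta_{k-1}\mathbb{E}[\eta_i(t_{k-1})^T]A_i'^T + A_i'\mathbb{E}[\eta_i(t_{k-1})]\zeta_{k-1}^T B_i'^T \\ & + A_i'\mathbb{E}[\eta_i(t_{k-1})]C_i'^T + (B_i'\zeta_{k-1} + C_i')\zeta_{k-1}^T B_i'^T + (B_i'\zeta_{k-1} + C_i')C_i'^T \Big\}. \end{aligned} \tag{33}$$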

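(vi) The update for $\Sigma_i$ can also be obtained similarly:

$$\Sigma_i' = \frac{1}{M}\sum_{k=1}^{M}\left\{ y_i(t_k)y_i(t_k)^T - L\,\mathbb{E}[\eta_i(t_k)]\,y_i(t_k)^T - y_i(t_k)\,\mathbb{E}[\eta_i(t_k)^T]L^T + L\,\mathbb{E}[\eta_i(t_k)\eta_i(t_k)^T]L^T \right\}. \tag{34}$$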

In the above update equations, $\mathbb{E}[\eta_i(t_k)]$ and $\mathbb{E}[\eta_i(t_k)\eta_i(t_k)^T]$ can be computed based on the conditional distribution $p(\eta_i(t_k) \mid Y_i)$ obtained in the E step, while the pairwise expectation $\mathbb{E}[\eta_i(t_k)\eta_i(t_{k-1})^T]$ can be derived using Bayes' theorem. The expressions for these expectations are as follows:

$$\begin{cases} \mathbb{E}[\eta_i(t_k)] = \hat{\mu}_k \\ \mathbb{E}[\eta_i(t_k)\eta_i(t_{k-1})^T] = \hat{\Phi}_k J_{k-1}^T + \hat{\mu}_k\hat{\mu}_{k-1}^T \\ \mathbb{E}[\eta_i(t_k)\eta_i(t_k)^T] = \hat{\Phi}_k + \hat{\mu}_k\hat{\mu}_k^T. \end{cases} \tag{35}$$

After the update $\sigma_i'$ of the estimated parameter is derived, we assign it to $\sigma_{old}$ and go back to the E step. This process is repeated until the update for the estimated parameters converges. When the estimation for $\sigma_i$ is obtained, each user can compute the bidding prices based on (18) and the bidding curve as shown in Fig. 9. The output-based bidding algorithm is summarized as Algorithm 1.


Algorithm 1 The output-based bidding algorithm
Initialization: Initial guess of parameters $\sigma_{old}$ and measurement sequence $Y_i$.

1: while convergence criteria not satisfied do
2:   Find the conditional distribution of the system state $p(Z_i \mid Y_i, \sigma_{old})$ using the Kalman filter and smoother.
3:   Find $\sigma_i$ that maximizes the complete-data likelihood function $Q(\sigma_i, \sigma_{old}) = \sum_{Z_i} p(Z_i \mid Y_i, \sigma_{old}) \ln p(Y_i, Z_i \mid \sigma_i)$. Denote the optimal solution as $\sigma_{new}$.
4:   Update the parameter estimation: $\sigma_{old} = \sigma_{new}$.
5: end while

Output: The estimation of the state trajectory $Z_i$ and the parameters $\sigma_i$.
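
As an illustration of one M-step component, the offset update in (32) amounts to averaging one-step residuals separately over the “on” and “off” transitions. The sketch below assumes a scalar state and scalar exogenous input (the paper's state and input are vectors), and takes the smoothed means from the E step as given. The function name and all numerical values are hypothetical.

```python
def update_offsets(mean, zeta, q_on, a, b):
    """Scalar analogue of the offset update (32).

    mean[k]  : smoothed state mean at step k (from the E step)
    zeta[k]  : exogenous input at step k
    q_on[k]  : 1 if the load is on at step k, 0 otherwise
    a, b     : current estimates of the state and input coefficients
    Returns (c_on, c_off); an entry is None if that mode never occurs.
    """
    on_res, off_res = [], []
    for k in range(1, len(mean)):
        residual = mean[k] - a * mean[k - 1] - b * zeta[k - 1]
        # The offset of the transition into step k depends on the power state at k-1.
        (on_res if q_on[k - 1] else off_res).append(residual)
    c_on = sum(on_res) / len(on_res) if on_res else None
    c_off = sum(off_res) / len(off_res) if off_res else None
    return c_on, c_off


# Synthetic, noise-free trajectory with known offsets (illustrative values only).
a, b = 0.95, 0.02
c_on_true, c_off_true = -0.4, 0.3
zeta = [30.0, 31.0, 32.0, 31.5, 30.5, 29.0]
q_on = [1, 1, 0, 0, 1, 0]
mean = [24.0]
for k in range(1, 7):
    c = c_on_true if q_on[k - 1] else c_off_true
    mean.append(a * mean[-1] + b * zeta[k - 1] + c)

c_on, c_off = update_offsets(mean, zeta, q_on, a, b)
```

On this noise-free trajectory the averages recover the offsets used to generate the data. With real measurements the smoothed means would come from the E step and the result would only be an estimate.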


VI. Case Studies 

This section applies the proposed market mechanism and the learning scheme to the TCL coordination problem considered in the GridWise® demonstration project [27], and presents simulation results to demonstrate the effectiveness of the proposed market mechanism.

A. Simulation Setup 

Similar to the GridWise® project, we consider a realistic scenario where each user is equipped with a smart thermostat that can measure the room temperature and communicate with the coordinator. At each period, the device measures the current room temperature and submits a bidding price based on a bidding curve. The coordinator collects all the user bids and clears the energy market with a price. Each device will then determine the control setpoint in response to this energy price, which modifies the load dynamics and affects the bids for the next period. This framework is validated in Matlab using the parameters generated in GridLAB-D [38]. The details of the demonstration project can be found in Section II.

Figure 10. Comparison of the actual power trajectory and the cleared power. The outside air temperature record is on August 20, 2009 in Columbus, OH.

Figure 11. The demand curve and the market clearing process at 08:20 AM. When the total demand is less than the feeder capacity constraint, the market price is equal to the base price.

The second-order ETP model is used to capture the load dynamics of the TCLs. The ETP model parameters depend on various building parameters, such as glass type, floor area, area per floor, glazing layers and material, etc. A detailed description of these parameters and their relations to the ETP model parameters can be found in [39]. In the simulation,
1000 sets of building parameters are generated. A few 
important parameters are randomly generated using the same 
approach as in E), while the rest take their default values 
from GridLAB-D. Throughout the simulation, the aggregated 
power of the unresponsive loads is assumed to be 12 MW, 
and the feeder power capacity is 15 MW. In addition, we refer to the “power” consumption of the load as the average power during one market period (5 minutes), unless otherwise stated.

We use the weather data and the Typical Meteorological Year (TMY2) data for Columbus, OH, obtained from [40], [41], which includes air temperature and solar radiation. The wholesale energy price is from the PJM market [42]. It is modified to a retail rate in $/kWh plus a retail modifier as defined by AEP's tariff [30], and we define this retail price as the base price.

B. Results and Analysis 

First the proposed mechanism is evaluated in the deterministic case, where each user can accurately estimate the unknown parameters. Each user submits a bid as described in Fig. 9, and the market is cleared according to the proposed outcome function (11)-(14). The air temperature record is from August 20 (mild day), 2009 in Columbus, OH, and the aggregated power trajectory is presented in Fig. 10.

It shows that the power trajectory is effectively capped below the feeder power capacity for the entire day. Notice that whenever the coordinator clears the market with an energy price, there is a corresponding power on the aggregated demand curve (as shown in Fig. 11 and Fig. 12). We call it the cleared power, which stands for the coordinator's estimation of the aggregated power consumption before the market is cleared. Simulation results demonstrate that this cleared power accurately matches the actual power consumption. This enables the coordinator to select proper prices to effectively achieve the desired aggregated power consumption. To demonstrate how it works, we randomly choose two market periods and present their market clearing procedures in Fig. 11 and Fig. 12, respectively. When there is no power congestion, the coordinator can directly pass the base price to individual users. This case is shown in Fig. 11, which corresponds to 08:20 AM in Fig. 10. When power congestion occurs, the coordinator should clear the market at the intersection of the aggregated demand curve and the curve of the feeder power constraint. This case is presented in Fig. 12, which corresponds to 04:40 PM.

Figure 12. The demand curve and the market clearing process at 04:40 PM. When the total demand exceeds the feeder capacity constraint, the market price is higher than the base price to respect the feeder capacity constraint.

Figure 13. Comparison of the trajectories of the market clearing price and wholesale energy price. The price is higher during the congestion, which effectively caps the peak energy at key times.

The trajectories of the market clearing price and the base price are shown in Fig. 13. The figure shows that the market clearing price is equal to the base price when there is no congestion, and is higher than the base price during congestion hours. Notice that when $C'(a) \neq P_{base}$, the market price can be different from the base price in uncongested periods as well, where $a$ is the total energy purchased from the wholesale market. In addition, a few price spikes can be observed in the simulation result. However, these spikes are not created by the proposed framework, but are instead caused by the fluctuations of the base price.

Figure 14. Comparison of the social welfare of the proposed pricing strategy and the base scenario. The base scenario adopts RTP and multiplies the base price by a fixed ratio to cap the total energy.

Figure 15. The estimation result of the output-based bidding algorithm. The initial guess of the simulation is randomly selected between 90% and 110% of its true value.

Furthermore, to evaluate the proposed mechanism in terms of social welfare maximization, we compare it with a base scenario, where Real Time Pricing (RTP) is adopted to cap the power in a heuristic way. More specifically, when there is no congestion, the market clearing price is equal to the base price. When power congestion occurs, the clearing price is the base price multiplied by a fixed ratio $\gamma$, which is greater than 1 and can cap the aggregated power below the limit effectively. We can run simulations to find such ratios, and among all the possible ratios, we choose the minimum one that can cap the aggregated power below the feeder capacity. Since the social welfare of the two scenarios will be the same during the uncongested period ($\gamma = 1$), we use the weather data on August 16, 2009, where more power congestion can be observed due to the elevated temperature. In this case $\gamma = 2.6$, and the social welfare of the two pricing strategies is shown in Fig. 14. The simulation results demonstrate that the proposed optimal pricing strategy always outperforms the base scenario in terms of social welfare. Notice that in this paper we run simulations for different values of $\gamma$ to find the minimum value that can cap the aggregated power. However, this ratio is difficult to derive in practice, and therefore a much more conservative value has to be used to operate the power grid safely. This will further reduce the social welfare of the base scenario.
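
The search for the minimum capping ratio in the base scenario can be sketched as a simple grid search. The example below is a toy illustration only: it reuses the linear aggregated demand implied by Example 2 (demand of 100 minus the price, with a cap of 50), uses hypothetical base prices, and treats the base price as the marginal price when computing the price of the proposed clearing rule. None of these numbers come from the simulations reported here.

```python
def demand(price: float) -> float:
    """Aggregated demand of the 100 users in Example 2, clipped to [0, 100]."""
    return min(100.0, max(0.0, 100.0 - price))


def min_capping_ratio(base_prices, cap, max_tenths=100):
    """Smallest ratio on a 0.1 grid that keeps every congested period under the cap."""
    congested = [p for p in base_prices if demand(p) > cap]
    for tenths in range(10, max_tenths + 1):
        ratio = tenths / 10
        if all(demand(ratio * p) <= cap for p in congested):
            return ratio
    return None  # no ratio on the grid caps the demand


base_prices = [20.0, 25.0, 30.0, 60.0]  # hypothetical base prices per period
cap = 50.0

ratio = min_capping_ratio(base_prices, cap)
# Heuristic scenario: scale the base price by the fixed ratio during congestion.
heuristic = [ratio * p if demand(p) > cap else p for p in base_prices]
# Proposed rule in this toy setting: the larger of the base price and the
# capacity price (here 100 - cap).
proposed = [max(p, 100.0 - cap) for p in base_prices]
print(ratio, heuristic, proposed)
```

In this toy setting the single ratio is dictated by the period with the lowest base price, so the other congested periods are priced above the level needed to respect the cap, whereas the capacity-based price is the same in every congested period.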

C. The Output-based Bidding Algorithm 

Figure 16. The actual power trajectory and the cleared power based on the demand curve with 2% bidding error. The outside air temperature record is on August 20, 2009 in Columbus, OH.

Figure 17. The aggregated power and the cleared power under real time pricing strategy. The base price is directly passed to the retail market.

This subsection shows how the proposed output-based bidding algorithm can be used to accurately estimate the bidding prices. Fig. 18 shows the simulation setup for the proposed algorithm. In the simulation, the ETP model parameters are the default values in GridLAB-D, which are generated based on some non-Gaussian distribution, while the process noises $w_i$ and the measurement noises $v_i$ are Gaussian. In addition, we assume that each device can locally measure its room temperature every minute, and store all the measurements for the past 6 hours, in which case $M = 360$. The algorithm is started with an initial guess $\sigma_{old}$ with 10% error. In other words, each element of the initial guess $\sigma_{old}$ is generated by randomly selecting a value between 90% and 110% of its true value. With the estimated parameters derived in the output-based bidding algorithm, each user can compute the bidding prices based on the realistic bidding strategies shown in Fig. 9. Here we choose a random user in a random market period and present its estimation result in Fig. 15. In this figure, the estimated bidding price is derived from the output-based bidding algorithm, while the true bidding price is computed based on the true value of the unknown parameters. It can be seen that the estimated bid closely follows the true value, with an average estimation error less than 1%.

When all the users apply the output-based bidding algorithm to compute the bidding prices, an error (of less than 1%) will be introduced. Now we evaluate how this estimation error can affect the aggregated power response. To implement the estimation framework, each device will locally perform the output-based bidding algorithm during each market period, which takes just 5.5 seconds on a laptop with a 2.5 GHz Intel i5 processor and 8 GB of memory. However, it is computationally intensive to run centralized simulations for all the users over 24 hours to show how the estimation error affects the aggregated power response. For this reason, instead of directly incorporating the output-based bidding algorithm in individual simulations, we add a simulated error of 2% (this simulated error is larger than the actual error of the output-based bidding algorithm) to each user's bidding price. The simulation results with this bidding error are presented in Fig. 16. It can be seen that the aggregated power is effectively capped below the feeder capacity during 84% of the time. In the cases where the feeder capacity constraint is violated, the aggregated power exceeds the power limit by 1.1% on average, and the maximum violation occurs at 4:15 PM, where the power limit is exceeded by 3.3%. Notice that these small violations can be easily fixed by adjusting the feeder capacity constraint in the formulation to be slightly more conservative.
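
The violation statistics quoted above (share of time under the cap, average and maximum exceedance) can be computed from a power trajectory as in the following sketch. The trajectory is a made-up five-period example rather than the simulated data, and the rule used to tighten the cap (dividing by one plus the largest observed relative violation) is one possible assumption, not a rule taken from this paper.

```python
def cap_violation_stats(power, cap):
    """Return (share of periods within the cap, mean violation %, max violation %)."""
    over = [100.0 * (p - cap) / cap for p in power if p > cap]
    within = 1.0 - len(over) / len(power)
    mean_over = sum(over) / len(over) if over else 0.0
    max_over = max(over) if over else 0.0
    return within, mean_over, max_over


# Hypothetical aggregated power per market period, in MW, with a 15 MW cap.
power = [14.0, 15.0, 15.3, 14.5, 15.15]
cap = 15.0

within, mean_over, max_over = cap_violation_stats(power, cap)
# One possible conservative adjustment: shrink the cap used in the clearing
# rule by the largest observed relative violation.
tightened_cap = cap / (1.0 + max_over / 100.0)
print(within, mean_over, max_over, tightened_cap)
```

A coordinator could rerun the clearing with the tightened cap and check that the resulting trajectory stays under the physical limit.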
Figure 18. Illustration for the output-based bidding algorithm. The ETP model parameters are default values in GridLAB-D generated according to a non-Gaussian distribution.

Figure 19. The market clearing strategy during power congestion under real time pricing. The market clearing point violates the feeder capacity constraint.

Figure 20. The actual power trajectory and the feeder capacity. The outside air temperature record is on August 16, 2009 in Columbus, OH.

Figure 21. The influence index of a randomly chosen user over different population sizes. The influence of an individual bid drops rapidly as the population size grows.


D. Comparison with Other Strategies 

In this subsection we first compare the proposed mecha¬ 
nism with RTP 0. RTP can incentivize users to shift demand 
from high price periods to low price periods to reduce electricity expenditures. However, as such an approach directly passes the base price to the retail market, it cannot achieve predictable and reliable aggregated power response, which is essential in many demand response programs. To illustrate these limitations, we compare our framework with RTP by applying RTP in the considered problem. In this simulation the coordinator clears the market by directly passing the base price to individual users, and the devices respond to the energy price according to the response curve described in Fig. 6. Except for the pricing strategy, all the parameters are the same as in the simulation in Section VI-B, and the result is presented in Fig. 17. When there is no congestion, the real time pricing scheme has the same performance as the proposed mechanism, and efficient energy allocation can be achieved. However, during the power congestion period, the RTP method cannot prevent the aggregated power from exceeding the feeder power limit. For instance, the market clearing process at 4:40 PM is presented in Fig. 19. In the example, due to the increased power demand in the afternoon, the market clearing point exceeds the feeder power limit. This issue can be solved in the proposed mechanism with an elevated energy price during power congestion. We emphasize that the proposed mechanism may also fail to cap the energy in certain extreme cases. For instance, when the outside temperature is extremely high and all the participating TCLs have very small thermal capacity and resistance, it is possible that a large number of users have to turn on the TCL for the entire market period, and the aggregated energy cannot be capped effectively. However, this is not due to the proposed mechanism, but rather because of the physical limitation of the system.

In addition, the proposed mechanism is also compared 
with the original Gridwise® demonstration project (base 
scenario). In the simulation, the market clearing strategies 
of the two cases are the same, while the bidding strategies 
are different. In the base scenario, each device submits a bid 
based on the current temperature, while the device in the 
proposed mechanism computes the bid according to Fig. 9.
Except for the bidding strategy, all the parameters are the
same as in the simulation in Section III-B, and the result
of the base scenario is presented in Fig. 20. In this case,
the user bids depend only on the room temperature, and
the information regarding the model parameters is missing.
Therefore, although the same pricing strategy is applied, the
coordinator still cannot achieve the desired aggregated power
response.

E. Impact of Some Key Parameters 

This subsection discusses how a few important parameters 
can affect the performance of the proposed mechanism. 

1) Number of Households: In this paper, we assume that
every user is a price taker: the bid of an individual user
cannot affect the market price. Theoretically, this assumption
only holds when the market satisfies some regulatory
conditions, such as sufficiently many users, free entry,
homogeneous goods, etc. (see Chap. 12 of the cited reference).
In this subsection we use numerical simulations to investigate
to what extent this assumption can be justified. In particular,
we simulate the influence of an individual bid on the market
price, and explore how this influence changes with the growing
number of participating households. This can be done by
perturbing the bidding price of a user i and observing how the
market price changes with this perturbation while the bids of
all the other users remain the same. It can be verified that
under the proposed clearing strategy, a user bid can affect
the market price in only two possible ways: it either changes
the market price by a fixed value (regardless of how big the
perturbation is) or has no influence at all. To quantitatively
represent this market influence, we define an influence index,
which is the maximum market price change (in percentage) that
a user could incur by perturbing its bidding price. When the
individual bid has no influence on the market clearing price,
the influence index is zero, and the price-taker assumption
holds.

Figure 22. The hourly temperature used in the simulation. The record is
from August 16 (hot day) and August 20 (mild day), 2009 in Columbus,
OH.

Figure 23. Comparison of the actual power trajectory and the cleared power.
The outside air temperature record is from August 16, 2009 in Columbus, OH.

The simulation proceeds in the following steps. First,
randomly choose a group of users (for example, 100 users)
and a market period, simulate the market bidding and clearing
process, and derive the corresponding market clearing price.
Second, choose one user from this group, perturb that user's
bid, and rerun the market clearing process to obtain another
market price. Third, compute the influence index based on the
market clearing prices derived from the first two steps. Fourth,
enlarge the group and repeat all the procedures described
above. Notice that when there is no congestion (as illustrated
in the corresponding market clearing figure), the clearing
price is always the base price, and the influence index is 0.
Therefore, all the simulations are done in a market period
during which power congestion occurs. In addition, to enforce
a fair comparison, we assume that the feeder capacity
constraint scales with the number of participating households.
For example, if the maximum power of each air conditioning
load is 5 kW, and there are N loads in the project, then the
maximum aggregated power is 5N kW, and the feeder power
capacity is 60% of the maximum power, i.e., 3N kW.
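The four steps above can be sketched in code. This is only a minimal illustration under stated assumptions: `clear_market` is a simplified stand-in clearing rule (base price without congestion, lowest admitted bid with congestion) and is not the clearing strategy of the proposed mechanism; the bid distribution, the perturbation values, and all function names are hypothetical. Only the 5 kW load size and the 60% feeder capacity are taken from the text.

```python
import random


def clear_market(bids, powers, base_price, feeder_limit):
    """Toy clearing rule (an assumption, not the paper's strategy).

    Devices bidding at least the base price request power. Without
    congestion the price is the base price; with congestion, devices are
    admitted in descending bid order and the price is the lowest admitted bid.
    """
    order = sorted(
        (i for i in range(len(bids)) if bids[i] >= base_price),
        key=lambda i: bids[i],
        reverse=True,
    )
    if sum(powers[i] for i in order) <= feeder_limit:
        return base_price
    used, price = 0.0, base_price
    for i in order:
        if used + powers[i] > feeder_limit:
            break
        used += powers[i]
        price = bids[i]
    return price


def influence_index(bids, powers, base_price, feeder_limit, user, perturbed_bids):
    """Largest percentage change of the clearing price caused by one user."""
    p0 = clear_market(bids, powers, base_price, feeder_limit)
    worst = 0.0
    for b in perturbed_bids:
        trial = list(bids)
        trial[user] = b  # perturb one bid, keep all other bids fixed
        p1 = clear_market(trial, powers, base_price, feeder_limit)
        worst = max(worst, abs(p1 - p0) / p0 * 100.0)
    return worst


if __name__ == "__main__":
    rng = random.Random(0)
    base_price, unit_power = 1.0, 5.0  # hypothetical price; 5 kW per load
    for n in (10, 100, 1000):
        bids = [rng.uniform(base_price, 3.0 * base_price) for _ in range(n)]
        powers = [unit_power] * n
        limit = 0.6 * unit_power * n  # feeder capacity scales as 3N kW
        idx = influence_index(bids, powers, base_price, limit, 0,
                              [base_price, 100.0 * base_price])
        print(n, round(idx, 3))
```

Scaling the feeder limit with the group size keeps every run in the congested regime, so the comparison across group sizes is not distorted by a changing congestion level.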



Figure 24. The market prices of the hot day and the mild day. The average 
price of the hot day is higher than that of the mild day. 



Figure 25. The estimation result of the output-based bidding algorithm
when the initial guess is randomly selected between 50% and 150% of its true
value.

The simulation result is shown in Fig. 21. It starts with 10
users, for which the influence index is around 35%. This
influence drops rapidly as the number of participating
loads increases. When there are more than 200 loads, the
influence index is always less than 1%. When the population
size is larger than 500, the influence index is less than 0.4%,
in which case the influence of individual users on the market
price can be safely neglected.

2) Weather Information: Aside from the number of
participating households, the outside temperature data is also
an important parameter that affects the performance of the
proposed mechanism. A high-temperature period can
significantly increase the aggregated power demand of the air
conditioning loads, and therefore cause more power
congestion. For this reason, we evaluate the proposed method
with a different temperature record. The data is obtained from
[40] for August 16 (hot day) in Columbus, OH, as shown in
Fig. 22.

The power trajectory and the market clearing prices are
presented in Fig. 23 and Fig. 24, respectively. Since the
elevated temperature increases the power demand, much 
more power congestion can be observed during the hot day. 
Despite the power congestion, the simulation result shows 
that the proposed framework can still effectively cap the 
aggregated power below the power limit, and the actual 
power accurately matches the planned power. 

Figure 26. The estimation result of the output-based bidding algorithm
when the process noise and measurement noise are generated based on a
uniform distribution.

3) Initial Guess of the Output-Based Bidding Algorithm:
The initial guess of the output-based bidding algorithm is
also crucial to the estimation performance. In
our previous simulations, the initial guess is generated by
randomly selecting a value between 90% and 110% of
its true value. Therefore, to implement the output-based
bidding algorithm, we need to assume that users have some
prior knowledge of the unknown parameters to guarantee
that the initial guess is within this range (from 90% to 110%
of the true value). In this subsection we explore to what
extent we can relax this assumption without compromising
the estimation performance. In particular, we use the same
model parameters as in Section III-C, and test the proposed
algorithm with an initial guess drawn between 50% and 150%
of the true value. The estimation result is shown in Fig. 25,
which shows that the output-based bidding algorithm can
accurately estimate the bidding prices even with 50% error
on the initial guess.
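The way the initial guess is drawn in these two settings can be sketched as follows. This is a minimal sketch, assuming the guess is sampled uniformly within the stated relative range; the function name and the parameter value used in the example are hypothetical and do not come from the paper.

```python
import random


def sample_initial_guess(true_value, max_rel_error, rng):
    """Draw a guess uniformly from [(1 - e), (1 + e)] times the true value."""
    factor = rng.uniform(1.0 - max_rel_error, 1.0 + max_rel_error)
    return true_value * factor


if __name__ == "__main__":
    rng = random.Random(0)
    true_param = 2.0  # hypothetical model parameter
    # 10% error: the range used in the earlier simulations.
    print(sample_initial_guess(true_param, 0.10, rng))
    # 50% error: the relaxed range tested in this subsection.
    print(sample_initial_guess(true_param, 0.50, rng))
```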

4) Model Noises: The proposed EM algorithm is developed
mainly based on the linear Gaussian model (19), where
we assume that $B^i_{on}$ and $B^i_{off}$ can be decomposed into
three parts: the external signals, a constant, and a Gaussian
process noise. However, we emphasize that in our particular
problem, the process noise does not have to be Gaussian,
and the proposed EM algorithm can be extended to a much
broader class of "real" dynamical systems. To support this
argument, the proposed EM algorithm is tested with
non-Gaussian noises. In particular, in the simulation the process
noises and the measurement noises are generated from a
uniform distribution, while the rest of the model parameters
are generated the same way as described in Section III-C.
The simulation result is presented in Fig. 26, where the
estimated bidding prices are close to the real bidding prices.
The key reason for this result is that in our problem, the
EM algorithm does not need to accurately estimate all the
unknown parameters (such as $A_i$, $B_i$ and $C_i$). Instead, we
only need to estimate the bidding price, which is a
scalar-valued function of these unknown parameters. According to
the bidding strategy, the bidding price mainly depends on the
current room temperature $T^i(t_k)$ and the room temperature
5 minutes after $t_k$. Since we have measurements of the room
temperature for the past 6 hours in the algorithm, this
information is already contained in these measurements except
for the last 5 minutes of the 6-hour period. Therefore, although
the algorithm may not converge to the true values of all the
unknown parameters, it does converge to the true bidding price
under non-Gaussian noise distributions.
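One way to generate such non-Gaussian test data is sketched below. It is a minimal sketch under stated assumptions: the scalar first-order model, its coefficients, and the function names are hypothetical stand-ins rather than the TCL model of the paper. Matching the variance of the uniform noise to a target standard deviation (half-width equal to the square root of 3 times sigma) is also our assumption; the text only states that the noise is uniform.

```python
import math
import random


def uniform_noise(sigma, rng):
    """Zero-mean uniform sample whose standard deviation equals sigma."""
    half_width = math.sqrt(3.0) * sigma  # Var of U(-h, h) is h^2 / 3
    return rng.uniform(-half_width, half_width)


def simulate(a, b, x0, sigma_w, sigma_v, steps, rng):
    """Scalar model x[k+1] = a*x[k] + b + w[k], y[k] = x[k] + v[k]."""
    xs, ys = [], []
    x = x0
    for _ in range(steps):
        x = a * x + b + uniform_noise(sigma_w, rng)  # process noise
        xs.append(x)
        ys.append(x + uniform_noise(sigma_v, rng))   # measurement noise
    return xs, ys


if __name__ == "__main__":
    rng = random.Random(0)
    # 72 samples correspond to 6 hours of 5-minute measurements.
    states, outputs = simulate(0.95, 1.2, 24.0, 0.1, 0.05, 72, rng)
    print(len(outputs), round(outputs[-1], 3))
```

Keeping the variance equal to that of the Gaussian case isolates the effect of the distribution shape, so any change in estimation quality is not simply a consequence of a different noise level.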


VII. CONCLUSION 

This paper presents a market mechanism for the coordination
of thermostatically controlled loads, where a coordinator
manages a group of TCLs using pricing incentives to maximize
the social welfare subject to a peak energy constraint. In
the paper, a mechanism is proposed to implement the desired
social choice function in dominant strategy equilibrium. This
mechanism consists of a novel bidding strategy that
incorporates information on both the load dynamics and the
time-varying user preferences. It is proven that under the
proposed mechanism, the coordinator can not only maximize the
social welfare but also realize the team optimal solution.
Future work includes formulating the fully dynamic market-based
coordination framework with multiple periods and extending
the results to energy storage devices and deferrable loads
such as plug-in electric vehicles, washers, and dryers, among
others.


Appendix 

A. Proof of Proposition 1 

When each device submits $h_i$ as the bid, we have
$b_i(\cdot;\theta_i) = h_i(\cdot;\theta_i)$. According to the clearing
strategy, each user will receive an energy allocation that satisfies
$a_i^* = h_i(P_c;\theta_i)$. Based on the definition of $h_i$, we have
$a_i^* = \arg\max_{0 \le a_i \le E_i^m} V_i(a_i;\theta_i) - P_c a_i$.
Therefore, when $b_i(\cdot;\theta_i) = h_i(\cdot;\theta_i)$, the resulting
energy allocation maximizes the utility of each user. According to
Definition 2, the strategy profile
$(h_1(\cdot;\theta_1),\ldots,h_N(\cdot;\theta_N))$ is
a dominant strategy equilibrium of the proposed mechanism.
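The key step of this proof, namely that the allocation returned by the truthful bid maximizes the user's own utility at the clearing price, can be checked numerically. The sketch below assumes a hypothetical concave quadratic valuation $V(a) = \beta a - \frac{\alpha}{2}a^2$, which is not taken from the paper; for this choice the utility-maximizing demand has the closed form used in `demand_bid`, and a brute-force grid search is used as an independent check.

```python
def valuation(a, alpha, beta):
    """Hypothetical concave valuation V(a) = beta*a - 0.5*alpha*a^2."""
    return beta * a - 0.5 * alpha * a * a


def demand_bid(price, alpha, beta, e_max):
    """Closed-form maximizer of V(a) - price*a over 0 <= a <= e_max."""
    a = (beta - price) / alpha
    return min(max(a, 0.0), e_max)


def best_response_by_grid(price, alpha, beta, e_max, n=10001):
    """Brute-force maximizer of the same utility on a uniform grid."""
    best_a, best_u = 0.0, float("-inf")
    for k in range(n):
        a = e_max * k / (n - 1)
        u = valuation(a, alpha, beta) - price * a
        if u > best_u:
            best_a, best_u = a, u
    return best_a


if __name__ == "__main__":
    alpha, beta, e_max = 2.0, 5.0, 2.0  # hypothetical user parameters
    for price in (0.0, 1.0, 2.5, 10.0):
        print(price, demand_bid(price, alpha, beta, e_max),
              best_response_by_grid(price, alpha, beta, e_max))
```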


B. Proof of Proposition 2 

Notice that the social choice function characterizes the
optimal solution to the coordinator's optimization problem,
and the team solution provides an upper bound on the
social welfare for problem (6). Therefore, to prove Proposition 2,
it is sufficient to show that the proposed pricing strategy
realizes the team solution.

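Based on Proposition 1, $b_i = h_i$. Therefore, we have the
following relations:

$$
\begin{cases}
a_i^* = h_i(P_c^*;\theta_i), \quad \text{for all } i = 1,\ldots,N \\
P_c^* = \max\{\bar{P}, \hat{P}^*\} \\
\hat{P}^* = C'\left(\sum_{i=1}^{N} a_i^*\right) \\
\sum_{i=1}^{N} h_i(\bar{P};\theta_i) = D.
\end{cases}
\tag{36}
$$

In addition, the KKT condition for the $i$th user's individual
utility maximization problem is as follows:

$$
-V_i'(a_i^*;\theta_i) + P_c^* + \mu_i^1 - \mu_i^2 = 0, \tag{37}
$$

where $\mu_i^1$ and $\mu_i^2$ are the Lagrangian multipliers satisfying:

$$
\begin{cases}
\mu_i^1 \ge 0, \quad \mu_i^2 \ge 0 \\
\mu_i^1 = 0 \quad \text{if } a_i^* \ne E_i^m \\
\mu_i^2 = 0 \quad \text{if } a_i^* \ne 0.
\end{cases}
\tag{38}
$$

Define $\mu = P_c^* - C'\left(\sum_{i=1}^{N} a_i^*\right)$; then equation (37) becomes:

$$
-V_i'(a_i^*;\theta_i) + C'\left(\sum_{i=1}^{N} a_i^*\right) + \mu + \mu_i^1 - \mu_i^2 = 0. \tag{39}
$$

According to (36), when $\sum_{i=1}^{N} a_i^* < D$, we have
$P_c^* = \hat{P}^* = C'\left(\sum_{i=1}^{N} a_i^*\right)$; therefore, $\mu = 0$.
When $\sum_{i=1}^{N} a_i^* = D$, we have $P_c^* = \bar{P}$, and therefore,
$\mu = \bar{P} - \hat{P}^*$. Since $h_i$ is non-increasing, we have
$\mu \ge 0$. This indicates that $\mu$, $\mu_i^1$ and $\mu_i^2$
are the Lagrangian multipliers of the team problem, and (39)
is exactly the KKT condition for the team problem. Since
the team problem is a concave optimization problem, the
KKT conditions are also sufficient. Thus $a^* = (a_1^*,\ldots,a_N^*)$
is the team solution. This completes the proof.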
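The argument of this proof can be checked on a small numerical instance. The sketch below is illustrative only and rests on assumptions that are not in the paper: each user has a hypothetical quadratic valuation $V_i(a) = \beta_i a - \frac{\alpha_i}{2}a^2$, and the cost has a constant marginal value, so the marginal-cost price equals a fixed base price. The code computes the clearing price as the larger of the base price and the price at which the aggregate demand equals the cap, and then recovers the multipliers that appear in the KKT conditions.

```python
def demand(price, alpha, beta, e_max):
    """Utility-maximizing energy for V(a) = beta*a - 0.5*alpha*a^2."""
    return min(max((beta - price) / alpha, 0.0), e_max)


def total_demand(price, users):
    return sum(demand(price, *u) for u in users)


def price_at_capacity(users, cap, tol=1e-10):
    """Bisection for the price at which aggregate demand meets the cap."""
    lo, hi = 0.0, max(beta for _, beta, _ in users)
    if total_demand(lo, users) <= cap:
        return lo  # the cap never binds, even at zero price
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if total_demand(mid, users) > cap:
            lo = mid
        else:
            hi = mid
    return hi


def clear(users, base_price, cap):
    """Clearing price, allocation, and multiplier of the capacity constraint."""
    p_c = max(price_at_capacity(users, cap), base_price)
    alloc = [demand(p_c, *u) for u in users]
    mu = p_c - base_price  # zero unless the cap is binding
    return p_c, alloc, mu


def user_multipliers(a, price, alpha, beta, e_max):
    """Multipliers of the bounds a <= e_max (mu1) and a >= 0 (mu2)."""
    slope = beta - alpha * a - price  # V'(a) - price
    mu1 = max(slope, 0.0) if a >= e_max else 0.0
    mu2 = max(-slope, 0.0) if a <= 0.0 else 0.0
    return mu1, mu2


if __name__ == "__main__":
    # Hypothetical users given as (alpha, beta, e_max).
    users = [(2.0, 5.0, 2.0), (1.0, 4.0, 3.0), (4.0, 6.0, 1.0)]
    p_c, alloc, mu = clear(users, base_price=1.0, cap=4.0)
    print(round(p_c, 4), [round(a, 4) for a in alloc], round(mu, 4))
```

With a binding cap the multiplier `mu` is the gap between the clearing price and the marginal cost; when the cap is slack, the clearing price falls back to the base price and `mu` is zero, which mirrors the two cases distinguished in the proof.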